Combating reverberation in large vocabulary continuous speech recognition
Authors
Abstract
Reverberation leads to high word error rates (WERs) for automatic speech recognition (ASR) systems. This work presents robust acoustic features motivated by subspace modeling and human speech perception for use in large vocabulary continuous speech recognition (LVCSR). We explore different acoustic modeling strategies and language modeling techniques, and demonstrate that robust features combined with acoustic modeling based on deep learning can significantly reduce WERs on reverberated speech compared to mel-cepstral features combined with acoustic modeling based on Gaussian Mixture Models (GMMs).
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
A Neural Network System for Large-Vocabulary Continuous Speech Recognition in Variable Acoustic Environments
Performance of speech recognizers is typically degraded by deleterious properties of the acoustic environment, such as multipath distortion (reverberation) and ambient noise. The degradation becomes more prominent as the microphone is positioned more distant from the speaker, for instance, in a teleconferencing application. Mismatched training and testing conditions, such as frequency respons...
Optimized Wavelet-based Speech Enhancement for Speech Recognition in Noisy and Reverberant Conditions
We present an improved speech enhancement method based on Wiener filtering in the wavelet domain for automatic speech recognition (ASR). The wavelet coefficients that are contaminated by the effects of late reflection and background noise are filtered using a Wiener gain. We optimize the wavelet parameters for speech, background noise and late reflection to achieve a better estimate of the Wien...
A Study on Combined Effects of Reverberation and Increased Vocal Effort on ASR
This study analyzes the individual and combined effect of room reverberation and increased vocal effort on automatic speech recognition. Robustness of several state-of-the-art front-end feature extraction strategies and normalizations to these sources of speech signal variability is evaluated in the context of large and small vocabulary recognition tasks on American English and Czech speech cor...
Generalized Posterior Probability for Verifying Recognized Words Optimally in Microphone Array Applications
In a large vocabulary continuous speech recognition (LVCSR) system, spoken input is converted into a string of hypothesized, possibly erroneous, words. However, the current state-of-the-art speech recognition technology is still not robust to all variability in speech signals, especially in a hands-free application. To make the signal pick-up from a speech source more immune to noise or room ...